This is an old article (from May 2004) that will probably not be published, because a much improved
paper with new results is in preparation. Still, I decided to put it in the archive because there are 
some things of interest here (in particular, the section on the S-K model) which will not appear in 
the new paper. 

Abstract 

We present a simple extension of Lindeberg's argument for the Central Limit Theorem to 
get a general invariance result. We apply the technique to prove results from random matrix 
theory, spin glasses, and maxima of random fields. 

1 Introduction and results 

J. W. Lindeberg's elegant proof of the Central Limit Theorem [15, 16], despite being in the shadow
of Fourier analytic methods for a long time, is now well known. It was revived by Trotter [25]
and has since been used successfully to derive CLTs in infinite dimensional spaces, where the 
Fourier analytic methods are not so useful. For more information on this topic, see the survey 
paper [3] and the monograph ^U]. (Another possible source is Bergstrom's books jlJ|S]. It is also 
worth mentioning that LeCam Jl] had a similar idea for Poisson approximation.) The ideas were 
carefully examined and generalized by Zolotarev [2E] through the introduction of the so-called
$\zeta$-metrics, which we shall not discuss here.

However, it seems that the basic method of replacing non-Gaussian random variables by Gaus- 
sians one by one and using Taylor expansion to get approximation bounds has been applied only 
for proving central limit theorems for sums of independent random elements, and its potential for 
proving more general invariance results has been overlooked in the literature. (After the preparation
of the initial draft of this article, it came to our notice that there is indeed an old article of
Rotar j2J which examines the Lindeberg method for polynomial maps in a limiting case. Also, earlier
this year, Mossel, O'Donnell, and Oleszkiewicz ^Sj made some striking applications to problems
from computer science and discrete mathematics using the Lindeberg method on polynomials.)

We shall derive a very simple extension of Lindeberg's argument to obtain a result for general
smooth functions. Basically, we shall show that if $f : \mathbb{R}^n \to \mathbb{R}$ is a function such that reasonable
fluctuations in any single coordinate (keeping others fixed) do not affect the value of the function in
a "big" way, then the distribution of $f(X_1, \ldots, X_n)$, where the $X_i$'s are independent random variables,
depends mainly on the first two moments of the $X_i$'s.

To make things precise, we first need a suitable measure of the largest possible influence of any 
single coordinate on the outcome. 

Definition 1.1. For any open interval $I$ containing 0, any positive integer $n$, any function
$f : I^n \to \mathbb{C}$ which is thrice differentiable in each coordinate, and $1 \le r \le 3$, let

$$\lambda_r(f) := \sup\{|\partial_i^p f(x)|^{r/p} : 1 \le i \le n,\ 1 \le p \le r,\ x \in I^n\},$$

where $\partial_i^p$ denotes $p$-fold differentiation with respect to the $i$th coordinate. For a collection $\mathcal{F}$ of such
functions, define $\lambda_r(\mathcal{F}) := \sup_{f \in \mathcal{F}} \lambda_r(f)$.

Note that the interval $I$ can be bounded or unbounded. The numbers $\lambda_r(f)$ jointly constitute a
measure of the maximum possible influence of the fluctuation in a single coordinate on the value of
$f$ at any point in the set $I^n$. We shall show that $f$ will have the aforementioned invariance property
when $\lambda_2(f)$ and $\lambda_3(f)$ are sufficiently small.

In this paper, we shall generally denote vectors by $x$, $y$ etc. The $i$th component of $x$ will be
denoted by $x_i$, of $y$ by $y_i$ and so on.

In what follows, $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_n)$ are two vectors of independent random
variables with finite second moments, taking values in some open interval $I$ and satisfying, for
each $i$, $\mathbb{E}X_i = \mathbb{E}Y_i$ and $\mathbb{E}X_i^2 = \mathbb{E}Y_i^2$. We shall also assume that $X$ and $Y$ are defined on the same
probability space and are independent. Finally, let $\gamma = \max\{\mathbb{E}|X_i|^3, \mathbb{E}|Y_i|^3 : 1 \le i \le n\}$. Note that
$\gamma$ may be $\infty$.

Here is our main result: 

Theorem 1.1. Let $f : I^n \to \mathbb{R}$ be thrice differentiable in each argument. If we set $U = f(X)$ and
$V = f(Y)$, then for any thrice differentiable $g : \mathbb{R} \to \mathbb{R}$ and any $K > 0$,

$$\begin{aligned}
|\mathbb{E}g(U) - \mathbb{E}g(V)| \le{}& C_1(g)\lambda_2(f) \sum_{i=1}^n \big[\mathbb{E}(X_i^2; |X_i| > K) + \mathbb{E}(Y_i^2; |Y_i| > K)\big] \\
&+ C_2(g)\lambda_3(f) \sum_{i=1}^n \big[\mathbb{E}(|X_i|^3; |X_i| \le K) + \mathbb{E}(|Y_i|^3; |Y_i| \le K)\big],
\end{aligned}$$

where $C_1(g) = \|g'\|_\infty + \|g''\|_\infty$ and $C_2(g) = \frac{1}{6}\|g'\|_\infty + \frac{1}{2}\|g''\|_\infty + \frac{1}{6}\|g'''\|_\infty$.

The last term in the above bound is usually dealt with as follows: having chosen a suitable $K$,
we use $\mathbb{E}(|X_i|^3; |X_i| \le K) \le K\,\mathbb{E}(X_i^2)$. When $\gamma < \infty$, we can do better:

Corollary 1.2. In the setting of the above Theorem, if we further have $\gamma < \infty$, then
$|\mathbb{E}g(U) - \mathbb{E}g(V)| \le 2C_2(g)\gamma n \lambda_3(f)$.

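For a quick example to see how Theorem 1.1 can be applied, consider the function $f(x) =
n^{-1/2}\sum_{i=1}^n x_i$. It is very easy to compute $\lambda_2(f) = n^{-1}$ and $\lambda_3(f) = n^{-3/2}$. Now suppose the $X_i$'s are
i.i.d. and the $Y_i$'s are also i.i.d. Further, assume $\mathbb{E}X_i = \mathbb{E}Y_i = 0$ and $\mathbb{E}X_i^2 = \mathbb{E}Y_i^2 = 1$ for all $i$. Then
taking $K = \epsilon\sqrt{n}$ and using Theorem 1.1 we can easily get

$$\begin{aligned}
\Big|\mathbb{E}g\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^n X_i\Big) - \mathbb{E}g\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^n Y_i\Big)\Big| \le{}& C_1(g)\big[\mathbb{E}(X_1^2; |X_1| > \epsilon\sqrt{n}) \\
&+ \mathbb{E}(Y_1^2; |Y_1| > \epsilon\sqrt{n})\big] + 2C_2(g)\epsilon.
\end{aligned}$$

Taking $n \to \infty$, this proves the classical CLT since $\epsilon$ is arbitrary. Furthermore, if we assume that
$\mathbb{E}|X_1|^3 < \infty$ and $\mathbb{E}|Y_1|^3 < \infty$, then Corollary 1.2 also gives an explicit error bound:

$$\Big|\mathbb{E}g\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^n X_i\Big) - \mathbb{E}g\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^n Y_i\Big)\Big| \le \frac{2C_2(g)\max\{\mathbb{E}|X_1|^3, \mathbb{E}|Y_1|^3\}}{\sqrt{n}}.$$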
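As a sanity check on this example, the following sketch evaluates both sides of the Corollary 1.2 bound $2C_2(g)\gamma n\lambda_3(f) = 2C_2(g)\gamma n^{-1/2}$ in closed form. The specific choices are assumptions made for illustration and do not appear in the text: the $X_i$ are Rademacher signs, the $Y_i$ are standard Gaussian, and $g = \cos$, for which $\mathbb{E}\cos(n^{-1/2}\sum X_i) = \cos(n^{-1/2})^n$, $\mathbb{E}\cos(Z) = e^{-1/2}$, $C_2(g) = 5/6$ and $\gamma = \mathbb{E}|Z|^3 = 2\sqrt{2/\pi}$. The function name is hypothetical.

```python
import math

def lindeberg_clt_check(n):
    """Exact |E g(U) - E g(V)| and the Corollary 1.2 bound for g = cos,
    X_i Rademacher (+1/-1 with probability 1/2), Y_i standard Gaussian.
    These distributions and g are illustrative assumptions."""
    # E cos(S_n / sqrt(n)) for Rademacher steps: the (real) characteristic
    # function of the normalised sum, evaluated at t = 1.
    lhs_x = math.cos(1.0 / math.sqrt(n)) ** n
    # E cos(Z) for Z ~ N(0, 1); the normalised Gaussian sum is exactly N(0, 1).
    lhs_y = math.exp(-0.5)
    # C_2(g) = |g'|/6 + |g''|/2 + |g'''|/6, all sup norms equal to 1 for cos.
    c2 = 1.0 / 6 + 1.0 / 2 + 1.0 / 6
    # gamma = max(E|X|^3, E|Z|^3) = max(1, 2*sqrt(2/pi)).
    gamma = max(1.0, 2.0 * math.sqrt(2.0 / math.pi))
    # lambda_3(f) = n^{-3/2} for f(x) = n^{-1/2} * sum(x).
    bound = 2.0 * c2 * gamma * n * n ** (-1.5)
    return abs(lhs_x - lhs_y), bound

for n in (10, 100, 1000, 10000):
    diff, bound = lindeberg_clt_check(n)
    print(f"n={n:6d}  |E g(U) - E g(V)| = {diff:.2e}   bound = {bound:.2e}")
```

For this smooth and symmetric choice the true difference decays much faster than the $n^{-1/2}$ bound, which is consistent with the bound being a worst-case statement over all admissible distributions.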

For a more complicated example, consider the Stieltjes transform of a Wigner matrix. For a given
$z \in \mathbb{C}\setminus\mathbb{R}$, define a function $f$ as

$$f((x_{ij})_{1 \le i \le j \le N}) = \mathrm{tr}\big((A((x_{ij})) - zI)^{-1}\big),$$

where $A((x_{ij}))$ is the $N$ by $N$ matrix whose $(i,j)$th element is $N^{-1/2}x_{ij}$ if $i \le j$ and $N^{-1/2}x_{ji}$
otherwise, $I$ is the $N$ by $N$ identity matrix, and "tr" stands for the trace of a matrix. In
Section 2 we shall use Theorem 1.1 to obtain invariance results about this function, which will in
turn yield the weakest known condition for convergence of spectral measures to Wigner's semicircle
law.

Another nontrivial example that we shall consider (in Section 3) is the free energy of the
Sherrington-Kirkpatrick model of spin glass theory. Here the function $f$ is given by

$$f((x_{ij})_{1 \le i < j \le N}) = \frac{1}{N}\log \sum_{\sigma} \exp\Big\{\frac{\beta}{\sqrt{N}}\sum_{i<j} x_{ij}\sigma_i\sigma_j + \beta h \sum_i \sigma_i\Big\},$$

where the sum is taken over all $\sigma = (\sigma_1, \ldots, \sigma_N) \in \{-1, 1\}^N$, and $\beta$, $h$ are parameters. To deal with
functions of this form, which commonly occur as free energy functions of various physical models,
we have the following general Theorem:

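Theorem 1.3. Suppose $\mathcal{F}$ is a finite collection of coordinatewise thrice differentiable functions
from $I^n$ into $\mathbb{R}$, and $\alpha \ge 1$. If $F : I^n \to \mathbb{R}$ is defined as $F(x) := \alpha^{-1}\log\big[\sum_{f \in \mathcal{F}} e^{\alpha f(x)}\big]$, then
$\lambda_2(F) \le 3\alpha\lambda_2(\mathcal{F})$ and $\lambda_3(F) \le 13\alpha^2\lambda_3(\mathcal{F})$.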

In Section 3 we shall derive a condition under which the asymptotic behaviour of the free energy
in the Sherrington-Kirkpatrick model does not depend on the exact distributions of the entries.
Our condition is weaker than the weakest previously known condition. In particular, it includes the "i.i.d.
mean zero unit variance" case.

Besides the possible applications to free energy functions mentioned before, Theorem 1.3
can have other important uses as well. For example, the following result is an easy application of
Theorems 1.1 and 1.3.

Theorem 1.4. Let $\mathcal{F}$ be as in Theorem 1.3. Let $U = \max_{f \in \mathcal{F}} f(X)$ and $V = \max_{f \in \mathcal{F}} f(Y)$. Then
for any thrice differentiable $g : \mathbb{R} \to \mathbb{R}$, any $K > 0$, and any $\alpha \ge 1$, we have

$$\begin{aligned}
|\mathbb{E}g(U) - \mathbb{E}g(V)| \le{}& 2\|g'\|_\infty \alpha^{-1}\log|\mathcal{F}| + 3\alpha C_1(g)\lambda_2(\mathcal{F})T_1(K) \\
&+ 13\alpha^2 C_2(g)\lambda_3(\mathcal{F})T_2(K),
\end{aligned}$$

where $T_1(K) = \sum_{i=1}^n [\mathbb{E}(X_i^2; |X_i| > K) + \mathbb{E}(Y_i^2; |Y_i| > K)]$ and $T_2(K) = \sum_{i=1}^n [\mathbb{E}(|X_i|^3; |X_i| \le
K) + \mathbb{E}(|Y_i|^3; |Y_i| \le K)]$.

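Again, we shall usually deal with $T_2(K)$ using $\mathbb{E}(|X|^3; |X| \le K) \le K\,\mathbb{E}X^2$. If $\gamma < \infty$, we have
a more explicit bound:

Corollary 1.5. In the setting of the above Theorem, if we further have $\gamma < \infty$, then

$$|\mathbb{E}g(U) - \mathbb{E}g(V)| \le K(g)\big[(\gamma n\lambda_3(\mathcal{F}))^{1/3}(\log|\mathcal{F}|)^{2/3} + \gamma n\lambda_3(\mathcal{F})\big],$$

where $K(g) = \frac{19}{3}\|g'\|_\infty + 13\|g''\|_\infty + \frac{13}{3}\|g'''\|_\infty$.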

In Section 4 we shall demonstrate an application of Theorem 1.4 involving the energy of the
ground state in the Sherrington-Kirkpatrick model of spin glasses. Essentially, we shall show that
under the same conditions on the $X_{ij}$'s as in Section 3, the asymptotic behaviour of

$$N^{-3/2}\max_{\sigma} \sum_{1 \le i < j \le N} X_{ij}\sigma_i\sigma_j,$$

where the maximum is taken over all $\sigma \in \{-1, 1\}^N$, does not depend on the exact distributions of
the $X_{ij}$'s.

For an immediate application, consider the (very old) question raised by Erdős and Kac [H]:
what is the limiting distribution of $\max_{1 \le i \le n} n^{-1/2}\sum_{j=1}^i X_j$, where the $X_i$'s are i.i.d. with mean zero
and unit variance? It is now well known that the limiting distribution is the same as that of $|Z|$,
where $Z \sim N(0, 1)$. Erdős and Kac proved the result for the case of the simple random walk; the
general result could be proved only after Donsker established the weak invariance principle. Using
Corollary 1.5 we can easily establish concrete error bounds for this problem under a finite third
moment assumption.

To work things out, let $\mathcal{F} = \{f_i : 1 \le i \le n\}$, where $f_i(x) := n^{-1/2}\sum_{j=1}^i x_j$. Clearly, $\lambda_3(\mathcal{F}) =
\max_{1 \le i \le n} \lambda_3(f_i) = n^{-3/2}$. Corollary 1.5 now gives the bound

$$|\mathbb{E}g(U) - \mathbb{E}g(V)| \le K(g)\big\{\gamma^{1/3} n^{-1/6}(\log n)^{2/3} + \gamma n^{-1/2}\big\},$$

where $U = \max_{1 \le i \le n} n^{-1/2}\sum_{j=1}^i X_j$ and $V = \max_{1 \le i \le n} n^{-1/2}\sum_{j=1}^i Y_j$.

The three Theorems presented in this section are very general in applicability, and present a 
unifying approach to solving examples of the kind mentioned above, rather than applying different 
techniques for different problems. However, the method has its deficiencies, the greatest being that 
functions have to be smooth. This is a rather severe restriction, and eliminates a lot of interesting 
examples. For example, the method will not allow us to deal with non-smooth functionals like
stopping times (in the case of random walks) and empirical distribution functions (for random
matrices). Smoothing approximations may sometimes give crude bounds. Furthermore, the restriction
about the boundedness of derivatives hampers the applicability to many interesting functions like
spectral radii of random matrices. Again, truncation techniques might work.

The next three sections will be devoted to working out in detail the examples mentioned before. 
Proofs of the Theorems and Corollaries will be presented in the last section. 

2 Convergence of spectral distributions 

In this section, we shall illustrate the application of our method to proving invariance results about 
random matrices. Specifically, we shall derive the weakest known condition under which the spectral 
measures of a sequence of Wigner matrices converge to the semicircle law. We begin with a very 
short introduction to some material from the spectral theory of large dimensional random matrices. 

2.1 Spectral measures 

The Empirical Spectral Distribution (ESD) of a square matrix is the probability distribution on 
the complex plane which puts equal mass on each eigenvalue of the matrix (repeated by multiplic- 
ities). The limit of a sequence of ESDs is called the Limiting Spectral Distribution (LSD) of the 
corresponding sequence of matrices. The existence and identification of LSDs for various kinds of 
random matrices is one of the main goals of random matrix theory. 

For an excellent review of mathematical results known about limiting spectral behaviour and
further references, see Bai [2]. For relevance in physics, see the book by Mehta [17].

2.2 Stieltjes transforms 

A standard tool for identifying the LSD of a sequence of random matrices is the Stieltjes transform.
To cut a long story short, we can say that the ESDs of a sequence $\{A_N\}_{N=1}^\infty$ of random real symmetric
matrices converge in probability (w.r.t. the Prokhorov metric, for example) to a probability
distribution $G$ if and only if

$$\forall z \in \mathbb{C}\setminus\mathbb{R}, \quad \frac{1}{N}\mathrm{Tr}\big((A_N - zI_N)^{-1}\big) \xrightarrow{P} \int \frac{1}{s - z}\,dG(s),$$

where $I_N$ is the identity matrix of order $N$. The expression on the right is the Stieltjes transform
of $G$ evaluated at $z$. Similarly, the expression on the left is the Stieltjes transform of the ESD of
$A_N$, evaluated at $z$. Stieltjes transforms will be particularly useful for applying our technique, since
they are infinitely differentiable as functions of the matrix entries.

2.3 Wigner matrices 

A random Wigner matrix of order N is an N by N real symmetric matrix with independent entries 
on and above the diagonal. 

More specifically, consider the map $A$ which "constructs" Wigner matrices of order $N$. Let
$n = N(N + 1)/2$ and write elements of $\mathbb{R}^n$ as $x = (x_{ij})_{1 \le i \le j \le N}$. For any $x \in \mathbb{R}^n$, let $A(x)$ be
the matrix whose $(i,j)$th entry is $N^{-1/2}x_{ij}$ if $i \le j$ and $N^{-1/2}x_{ji}$ if $i > j$. If $X$ is a vector of
$n$ independent standard Gaussian random variables, then $A(X)$ is a standard Gaussian Wigner
matrix. Wigner showed that the LSD for a sequence of standard Gaussian Wigner matrices is
the semicircle law, which has density $(2\pi)^{-1}\sqrt{4 - x^2}$ in $[-2, 2]$.
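The construction of $A(x)$ and the Stieltjes transform of its ESD can be sketched numerically as follows. This is only an illustration under assumptions not made in the text: the function names are hypothetical, the matrix size and the point $z = i$ are arbitrary, and the closed form $m(z) = (-z + \sqrt{z^2 - 4})/2$ (branch with $\mathrm{Im}\, m(z) > 0$ for $\mathrm{Im}\, z > 0$) is the standard Stieltjes transform of the semicircle law under the convention $\int (s - z)^{-1}\,dG(s)$.

```python
import numpy as np

def wigner_matrix(x_upper, N):
    """Build A(x): entry (i, j) is N^{-1/2} x_ij for i <= j, then symmetrise."""
    A = np.zeros((N, N))
    iu = np.triu_indices(N)            # index pairs with i <= j
    A[iu] = x_upper
    A = A + np.triu(A, k=1).T          # mirror the strict upper triangle
    return A / np.sqrt(N)

def stieltjes_esd(A, z):
    """(1/N) Tr((A - zI)^{-1}), computed from the real eigenvalues of A."""
    eig = np.linalg.eigvalsh(A)
    return np.mean(1.0 / (eig - z))

def stieltjes_semicircle(z):
    """Stieltjes transform of the semicircle law for Im z > 0."""
    r = np.sqrt(complex(z) ** 2 - 4.0)
    m = (-z + r) / 2.0
    # The two roots are reciprocal; keep the one with positive imaginary part.
    return m if m.imag > 0 else (-z - r) / 2.0

rng = np.random.default_rng(0)
N, z = 300, 1j
n = N * (N + 1) // 2
gauss = wigner_matrix(rng.standard_normal(n), N)           # Gaussian entries
signs = wigner_matrix(rng.choice([-1.0, 1.0], size=n), N)  # Rademacher entries
print(stieltjes_esd(gauss, z), stieltjes_esd(signs, z), stieltjes_semicircle(z))
```

Both empirical values should be close to the semicircle value, which is the kind of insensitivity to the entry distribution that the following sections make precise.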

It was later shown that the distribution of the entries does not play a significant role: convergence
to the semicircle law holds under more general conditions (cf. Arnold [1], Grenander [10]
and Bai [2]). The weakest known condition under which the convergence to the semicircle law holds
was given by Pastur [20]. It is claimed that the condition was shown to be necessary by Girko [§].
For a detailed exposition, see Bai [2] or Khorunzhy, Khoruzhenko and Pastur.

The method of this paper will give an easy way to show the sufficiency of Pastur's condition.
Incidentally, somewhat similar ideas involving derivatives of empirical characteristic functions (instead
of Stieltjes transforms) to get concentration bounds for ESDs have been explored in Chatterjee
and Bose (Jj-

2.4 Derivation of Pastur's condition 

To get started, fix $z = u + iv \in \mathbb{C}$, with $v \ne 0$. Define $f : \mathbb{R}^n \to \mathbb{C}$ as

$$f(x) := \frac{1}{N}\mathrm{Tr}\big((A(x) - zI)^{-1}\big).$$

Also, define $G : \mathbb{R}^n \to \mathbb{C}^{N \times N}$ as $G(x) := (A(x) - zI)^{-1}$. Now note that from matrix theory we know
that inverting a matrix involves computing the classical adjoint and dividing by the determinant,
which implies that the elements of the inverse are all rational functions of the elements of the
original matrix. Also note that since all eigenvalues of $A(x)$ are real, $\det(A(x) - zI) \ne 0$.

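Thus, $G$ is infinitely differentiable along each coordinate. Also note that $(A(x) - zI)G(x) = I$ for
each $x$. Thus for $1 \le i \le j \le N$, $\frac{\partial}{\partial x_{ij}}[(A - zI)G] = 0$, which gives

$$\frac{\partial G}{\partial x_{ij}} = -G\,\frac{\partial A}{\partial x_{ij}}\,G.$$

Also, note that higher order derivatives of $A$ vanish identically. Combining everything we easily
get

$$\frac{\partial f}{\partial x_{ij}} = -\frac{1}{N}\mathrm{Tr}\Big(\frac{\partial A}{\partial x_{ij}}G^2\Big), \tag{1}$$

$$\frac{\partial^2 f}{\partial x_{ij}^2} = \frac{2}{N}\mathrm{Tr}\Big(\frac{\partial A}{\partial x_{ij}}G\frac{\partial A}{\partial x_{ij}}G^2\Big), \tag{2}$$

$$\frac{\partial^3 f}{\partial x_{ij}^3} = -\frac{6}{N}\mathrm{Tr}\Big(\frac{\partial A}{\partial x_{ij}}G\frac{\partial A}{\partial x_{ij}}G\frac{\partial A}{\partial x_{ij}}G^2\Big). \tag{3}$$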
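The first two derivative identities, $\partial f/\partial x_{ij} = -N^{-1}\mathrm{Tr}(\frac{\partial A}{\partial x_{ij}}G^2)$ and $\partial^2 f/\partial x_{ij}^2 = 2N^{-1}\mathrm{Tr}(\frac{\partial A}{\partial x_{ij}}G\frac{\partial A}{\partial x_{ij}}G^2)$, can be checked against finite differences. The sketch below is a numerical check under arbitrary assumptions (a small random symmetric matrix, a fixed $z$, one off-diagonal coordinate); all function names are hypothetical.

```python
import numpy as np

def resolvent(M, z):
    """G = (A - zI)^{-1} with A = N^{-1/2} M, for a real symmetric M."""
    N = M.shape[0]
    return np.linalg.inv(M / np.sqrt(N) - z * np.eye(N))

def f(M, z):
    """f = (1/N) Tr G."""
    return np.trace(resolvent(M, z)) / M.shape[0]

def derivative_formulas(M, z, i, j):
    """First and second derivatives of f in x_ij via the trace identities."""
    N = M.shape[0]
    dA = np.zeros((N, N))
    dA[i, j] = dA[j, i] = 1.0 / np.sqrt(N)   # dA/dx_ij
    G = resolvent(M, z)
    d1 = -np.trace(dA @ G @ G) / N
    d2 = 2.0 * np.trace(dA @ G @ dA @ G @ G) / N
    return d1, d2

rng = np.random.default_rng(1)
N, z, i, j = 8, 0.3 + 0.7j, 1, 4
M = rng.standard_normal((N, N))
M = (M + M.T) / 2.0                          # any real symmetric matrix will do
E = np.zeros((N, N))
E[i, j] = E[j, i] = 1.0                      # direction of the coordinate x_ij

t1, t2 = 1e-6, 1e-4                          # step sizes for the two differences
numeric1 = (f(M + t1 * E, z) - f(M - t1 * E, z)) / (2 * t1)
numeric2 = (f(M + t2 * E, z) - 2 * f(M, z) + f(M - t2 * E, z)) / t2 ** 2
exact1, exact2 = derivative_formulas(M, z, i, j)
print(abs(numeric1 - exact1), abs(numeric2 - exact2))
```

Both discrepancies should be at the level of finite-difference error, and the exact values can also be compared with the bounds $2|v|^{-2}N^{-3/2}$ and $4|v|^{-3}N^{-2}$ derived in this section.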

Now we need to find good bounds for the above quantities. For that, we need some preparation. 

For an $N \times N$ complex matrix $B = ((b_{ij}))$, the Hilbert-Schmidt norm (or Schur norm, or
Euclidean norm) of $B$ is defined as $\|B\| := (\sum_{i,j} |b_{ij}|^2)^{1/2}$. Besides the usual properties of a matrix
norm, it also satisfies the following:

1. $|\mathrm{Tr}(BC)| \le \|B\|\,\|C\|$.

2. If $U$ is a unitary matrix, then for any $C$ of the same order, $\|CU\| = \|UC\| = \|C\|$.

3. For a normal matrix $B$ (i.e. $B^*B = BB^*$, $B^*$ being the conjugate transpose of $B$) with
eigenvalues $\lambda_1, \ldots, \lambda_N$, and any $C$, $\max\{\|BC\|, \|CB\|\} \le \max_{1 \le i \le N} |\lambda_i| \cdot \|C\|$.

The first property follows from the Cauchy-Schwarz inequality. The second is true because $\|Uy\|_2 =
\|y\|_2$ for any unitary matrix $U$ and any vector $y \in \mathbb{C}^N$, where $\|\cdot\|_2$ denotes the Euclidean norm
on $\mathbb{C}^N$. For the last one, note that any normal matrix $B$ can be written as $B = U\Lambda U^*$ where $U$ is
unitary and $\Lambda$ is diagonal, with the diagonal elements being the eigenvalues of $B$, and then apply
the second property.

The above facts are standard, and may be looked up in any standard text on matrix analysis.
See Wilkinson, pp. 55-58, for example.
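The three properties of the Hilbert-Schmidt norm are easy to test numerically. The sketch below does so for random complex matrices; the matrix size, the seed, and the use of a Hermitian matrix as the normal matrix in the third property are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
C = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))

def hs(M):
    """Hilbert-Schmidt (Frobenius) norm."""
    return np.linalg.norm(M, "fro")

# Property 1: |Tr(BC)| <= ||B|| ||C||.
p1 = abs(np.trace(B @ C)) <= hs(B) * hs(C)

# Property 2: multiplication by a unitary matrix preserves the norm.
U, _ = np.linalg.qr(B)                 # Q factor of a complex QR is unitary
p2 = np.isclose(hs(C @ U), hs(C)) and np.isclose(hs(U @ C), hs(C))

# Property 3: for a normal matrix H, ||HC|| and ||CH|| are at most
# (largest |eigenvalue| of H) * ||C||.  A Hermitian H is used here.
H = (B + B.conj().T) / 2.0
lam = np.max(np.abs(np.linalg.eigvalsh(H)))
p3 = max(hs(H @ C), hs(C @ H)) <= lam * hs(C) + 1e-12

print(p1, p2, p3)
```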

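Now, it is easy to see that $G$ and the derivatives of $A$ are all normal matrices. Moreover,
the eigenvalues of $G$ are bounded in absolute value by $|v|^{-1}$ (where $v = \mathrm{Im}\, z$) and the eigenvalues of
$\partial A/\partial x_{ij}$ are bounded by $N^{-1/2}$. (Note that $\partial A/\partial x_{ij}$ is the matrix which has $N^{-1/2}$ at the
$(i,j)$th and $(j,i)$th positions, and 0 elsewhere.)

Thus, from the spectral representation of $G^2$ it follows that the elements of $G^2$ are bounded by
$|v|^{-2}$. This fact, and the identity (1), imply that

$$\Big|\frac{\partial f}{\partial x_{ij}}\Big| \le 2|v|^{-2}N^{-3/2}. \tag{4}$$

Next, using the expression (2) and the three properties of the Hilbert-Schmidt norm discussed
above, we get

$$\Big|\frac{\partial^2 f}{\partial x_{ij}^2}\Big| \le \frac{2}{N}\Big\|\frac{\partial A}{\partial x_{ij}}\Big\| \cdot \Big\|G\frac{\partial A}{\partial x_{ij}}G^2\Big\| \le 4|v|^{-3}N^{-2}. \tag{5}$$

Similarly, (3) gives

$$\Big|\frac{\partial^3 f}{\partial x_{ij}^3}\Big| \le 12|v|^{-4}N^{-5/2}. \tag{6}$$

From (4), (5) and (6) it follows that

$$\lambda_2(f) \le 4\max\{|v|^{-4}, |v|^{-3}\}N^{-2}, \qquad \lambda_3(f) \le 12\max\{|v|^{-6}, |v|^{-9/2}, |v|^{-4}\}N^{-5/2}.$$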



Let $X = (X_{ij})_{1 \le i \le j \le N}$ and $Y = (Y_{ij})_{1 \le i \le j \le N}$ be collections of independent random variables with
zero mean and unit variance. Let $U = \mathrm{Re}\, f(X)$ and $V = \mathrm{Re}\, f(Y)$, and let $g : \mathbb{R} \to \mathbb{R}$ be any thrice
differentiable function. Note that $\mathrm{Re}\, f$ is a smooth function and $\lambda_r(\mathrm{Re}\, f) \le \lambda_r(f)$ for each $r$. With
$K = \epsilon\sqrt{N}$, Theorem 1.1 immediately tells us that $|\mathbb{E}g(U) - \mathbb{E}g(V)|$ can be bounded by a multiple
(depending only on $g$ and $v$) of

$$N^{-2}\sum_{1 \le i \le j \le N} \big[\mathbb{E}(X_{ij}^2; |X_{ij}| > \epsilon\sqrt{N}) + \mathbb{E}(Y_{ij}^2; |Y_{ij}| > \epsilon\sqrt{N})\big] + \epsilon.$$

The same bound also works for functions of the imaginary parts. Using this result and Wigner's
Theorem for Gaussian matrices, we see that convergence to the semicircle law holds whenever the $X_{ij}$'s
are independent with zero mean and unit variance, and satisfy

$$\forall \epsilon > 0, \quad \lim_{N \to \infty} N^{-2}\sum_{1 \le i \le j \le N} \mathbb{E}(X_{ij}^2; |X_{ij}| > \epsilon\sqrt{N}) = 0. \tag{7}$$

This is exactly Pastur's condition, as mentioned before. The condition is satisfied, for example, if the
$X_{ij}$'s are i.i.d. with zero mean and unit variance. Also note that though this looks like Lindeberg's
condition for the central limit theorem, it is not exactly that.

3 Universality of a spin glass model 

In this section, we obtain a condition for invariance (or, as physicists say, universality) of the 
limiting free energy of the Sherrington-Kirkpatrick model of spin glasses. We begin with a short 
introduction. 

3.1 Spin glasses 

Let $\Sigma_N = \{-1, 1\}^N$. This is the space of all possible spins of $N$ particles in statistical mechanics.
The spins are random, but not independent — the spin of one particle exerts influence on the spin 
of another. The joint law of the N spins is a matter of great interest and intrigue. Various models 
have been suggested over the years for various situations. Some of these models, like the famous 
Ising model, are deterministic in the sense that none of the model parameters are random, while 
some others, like the Sherrington-Kirkpatrick model which we shall discuss here, involve random 
variables as model parameters. 

All models assign a probability proportional to $\exp(-\beta H_N(\sigma))$ to the configuration $\sigma$, where
$H_N$ is the Hamiltonian, and $\beta = 1/T$, $T$ being the temperature. The partition function is $Z_N =
\sum_{\sigma \in \Sigma_N} \exp(-\beta H_N(\sigma))$, and the free energy is the log of the partition function divided by $N$. The
asymptotic behaviour of the free energy is of great consequence and interest to physicists, and
nowadays, to people working on neural networks also.

For a detailed discussion of mathematical results about spin glass models and further references, 
see Talagrand for instance. 

3.2 The Sherrington-Kirkpatrick model 

The Sherrington-Kirkpatrick (S-K) model, introduced in [22], can be briefly described as follows:
For each $N \ge 1$ let $\{\mathfrak{g}_{ij}, 1 \le i, j \le N\}$ be a collection of i.i.d. $N(0, 1)$ random variables. The S-K
model assigns a random probability distribution (the Gibbs measure) on $\Sigma_N$ as follows: For any
configuration $\sigma \in \Sigma_N$, the probability of the system being in the state $\sigma = (\sigma_1, \ldots, \sigma_N)$ is given
by

$$Z_{N,\mathfrak{g}}^{-1}\exp(-\beta H_{N,\mathfrak{g}}(\sigma)),$$

where $H_{N,\mathfrak{g}}(\sigma) = -\frac{1}{\sqrt{N}}\sum_{i<j} \mathfrak{g}_{ij}\sigma_i\sigma_j - h\sum_{i \le N} \sigma_i$, $\beta$ and $h$ are fixed parameters, and $Z_{N,\mathfrak{g}}$ is the
normalising constant. Ideally, the subscripts should include $\beta$ and $h$, but we are considering them
to be fixed. It has been shown by Guerra and Toninelli [12] that the limit

$$\lim_{N \to \infty} \frac{1}{N}\mathbb{E}(\log Z_{N,\mathfrak{g}})$$

exists for all $\beta$ and $h$. See Talagrand [23], Theorem 2.10.1, p. 140 for a proof. A formula for the
limit was conjectured by Parisi and proved by Talagrand [21]. Talagrand ([23], Corollary 2.2.5, p.
32) also proves (in particular) that

$$\frac{1}{N}(\log Z_{N,\mathfrak{g}} - \mathbb{E}\log Z_{N,\mathfrak{g}}) \to 0$$

for any $\beta$ and $h$. Both the above facts were proved under the condition that the $\mathfrak{g}_{ij}$ are i.i.d. $N(0, 1)$.
In fact, the rigorous proofs involve the use of intricate properties of Gaussian random variables.
Recently, in a paper which was archived at a time when this article was being written, Carmona
and Hu [6] have proved that the limit will exist and be the same when the $\mathfrak{g}_{ij}$ are i.i.d. with zero mean,
unit variance and finite third moment. Their technique may be extended to the case of independent
variables with uniformly bounded third absolute moments.

We shall derive a sufficient condition for invariance of the limiting free energy, which is weaker
than the condition given by Carmona and Hu, and includes the case where the $\mathfrak{g}_{ij}$'s are i.i.d. with zero
mean and unit variance, with no assumption about the third moment.



3.3 Our condition 

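Let $\mathcal{F} = \{f_\sigma : \sigma \in \{-1, 1\}^N\}$, where

$$f_\sigma(x) := \beta N^{-3/2}\sum_{i<j} x_{ij}\sigma_i\sigma_j + \beta h N^{-1}\sum_i \sigma_i.$$

Then clearly, $\lambda_2(\mathcal{F}) = \beta^2 N^{-3}$, $\lambda_3(\mathcal{F}) = \beta^3 N^{-9/2}$ and $|\mathcal{F}| = 2^N$. Now, if we define $F(x) =
N^{-1}\log\big[\sum_\sigma e^{N f_\sigma(x)}\big]$, then by Theorem 1.3 (with $\alpha = N$), $\lambda_2(F) \le 3\beta^2 N^{-2}$ and $\lambda_3(F) \le 13\beta^3 N^{-5/2}$.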
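For small $N$ the free energy $F(x) = N^{-1}\log\sum_\sigma \exp\{\beta N^{-1/2}\sum_{i<j} x_{ij}\sigma_i\sigma_j + \beta h\sum_i \sigma_i\}$ can be computed by brute-force enumeration of all $2^N$ configurations, which allows a direct check of the bounds $|\partial F/\partial x_{ij}| \le \beta N^{-3/2}$ and $|\partial^2 F/\partial x_{ij}^2| \le 3\beta^2 N^{-2}$ that follow from the estimates above. The sketch below is only an illustration: the function name, the parameter values, the seed and the finite-difference step are assumptions.

```python
import itertools
import numpy as np

def sk_free_energy(x, beta, h):
    """F(x) = N^{-1} log sum_sigma exp(N f_sigma(x)), by full enumeration.

    x is an N x N array; only the entries with i < j are used."""
    N = x.shape[0]
    iu = np.triu_indices(N, k=1)
    spins = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    # Row s holds the products sigma_i * sigma_j (i < j) of configuration s.
    pair = spins[:, iu[0]] * spins[:, iu[1]]
    energy = (beta / np.sqrt(N)) * (pair @ x[iu]) + beta * h * spins.sum(axis=1)
    m = energy.max()                          # stabilised log-sum-exp
    return (m + np.log(np.exp(energy - m).sum())) / N

rng = np.random.default_rng(3)
N, beta, h, t = 8, 1.2, 0.3, 1e-3
x = rng.standard_normal((N, N))
e = np.zeros((N, N))
e[0, 1] = 1.0                                 # perturb the coordinate x_{01}

F0 = sk_free_energy(x, beta, h)
Fp = sk_free_energy(x + t * e, beta, h)
Fm = sk_free_energy(x - t * e, beta, h)
d1 = (Fp - Fm) / (2 * t)                      # central first difference
d2 = (Fp - 2 * F0 + Fm) / t ** 2              # central second difference
print(d1, beta * N ** -1.5)
print(d2, 3 * beta ** 2 / N ** 2)
```

The enumeration costs $2^N$ evaluations, so this is only practical for very small systems; it is meant as a check of the derivative bounds, not as a way to study the limit.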

Suppose $\mathfrak{g}$ and $\mathfrak{g}'$ are collections of independent random variables with zero mean and unit
variance. If we let $U_N = F(\mathfrak{g})$ and $V_N = F(\mathfrak{g}')$, then by Theorem 1.1, for any thrice differentiable
$g : \mathbb{R} \to \mathbb{R}$ and any fixed $\epsilon > 0$, $|\mathbb{E}g(U_N) - \mathbb{E}g(V_N)|$ is bounded by a constant multiple (depending
only on $g$ and $\beta$) of

$$N^{-2}\sum_{1 \le i < j \le N} \big[\mathbb{E}(\mathfrak{g}_{ij}^2; |\mathfrak{g}_{ij}| > \epsilon\sqrt{N}) + \mathbb{E}(\mathfrak{g}_{ij}'^2; |\mathfrak{g}_{ij}'| > \epsilon\sqrt{N})\big] + \epsilon.$$

This shows that the limit of the free energy is the same as that in the i.i.d. standard Gaussian case
whenever the $\mathfrak{g}_{ij}$'s are independent with zero mean and unit variance, and satisfy

$$\forall \epsilon > 0, \quad \lim_{N \to \infty} N^{-2}\sum_{1 \le i < j \le N} \mathbb{E}(\mathfrak{g}_{ij}^2; |\mathfrak{g}_{ij}| > \epsilon\sqrt{N}) = 0. \tag{8}$$

Note that this is almost exactly condition (7), the only difference being that here we do not have
terms corresponding to $i = j$. In particular, it is satisfied when the $\mathfrak{g}_{ij}$'s are i.i.d. with zero mean and
unit variance.

Under the assumption of uniformly bounded third absolute moments, Corollary 1.2 can be
applied to get an explicit error bound of order $N^{-1/2}$, which is the same as that obtained by
Carmona and Hu [6].

4 Ground state of the S-K model 

The ground state in a spin glass model is the configuration which minimizes the Hamiltonian. With
$\beta = 1$ and $h = 0$ for simplicity, the energy of the ground state is given by

$$S_N(\mathfrak{g}) = \max_{\sigma \in \Sigma_N} \sum_{1 \le i < j \le N} \mathfrak{g}_{ij}\sigma_i\sigma_j.$$

Guerra and Toninelli HUE] proved that $N^{-3/2}S_N(\mathfrak{g})$ converges almost surely and in average to a
deterministic limit if $\mathfrak{g}$ is a collection of standard Gaussian random variables. This was extended to
the case of i.i.d. entries with zero mean, unit variance and finite third moment by Carmona and
Hu [6]. We shall show that convergence in probability and in average to the same limit holds
if the $\mathfrak{g}_{ij}$'s are independent and satisfy the same condition as in the previous section.

Let $\mathcal{F}$, $\mathfrak{g}$ and $\mathfrak{g}'$ be as in the previous section, with $\beta = 1$ and $h = 0$. If we let $U_N = \max_\sigma f_\sigma(\mathfrak{g})$
and $V_N = \max_\sigma f_\sigma(\mathfrak{g}')$, then by Theorem 1.4, for any thrice differentiable $g : \mathbb{R} \to \mathbb{R}$ and any fixed
$K > 0$ and $\alpha \ge 1$, $|\mathbb{E}g(U_N) - \mathbb{E}g(V_N)|$ is bounded by a constant multiple (depending only on $g$) of

$$\alpha^{-1}N + \alpha N^{-3}\sum_{i<j} \big[\mathbb{E}(\mathfrak{g}_{ij}^2; |\mathfrak{g}_{ij}| > K) + \mathbb{E}(\mathfrak{g}_{ij}'^2; |\mathfrak{g}_{ij}'| > K)\big] + \alpha^2 N^{-5/2}K.$$

Now choose any $A \ge 1$ and $\epsilon > 0$, and put $\alpha = AN$ and $K = \epsilon\sqrt{N}$. Substituting these values in
the above expression, we get

$$A^{-1} + AN^{-2}\sum_{i<j} \big[\mathbb{E}(\mathfrak{g}_{ij}^2; |\mathfrak{g}_{ij}| > \epsilon\sqrt{N}) + \mathbb{E}(\mathfrak{g}_{ij}'^2; |\mathfrak{g}_{ij}'| > \epsilon\sqrt{N})\big] + A^2\epsilon.$$

Thus, under condition (8) of the previous section, $\limsup_{N \to \infty} |\mathbb{E}g(U_N) - \mathbb{E}g(V_N)|$ is bounded by a
constant multiple of $A^{-1} + A^2\epsilon$. This proves the claim, since $A$ and $\epsilon$ are arbitrary.

Again, Corollary 1.5 can be applied to obtain an error bound of order $N^{-1/6}$ under the assumption
of uniformly bounded third absolute moments.

5 Proofs 

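Proof of Theorem 1.1. As mentioned before, the proof is just an easy extension of Lindeberg's
argument for the classical central limit theorem. Fix $f$ and $g$ as in the statement of the Theorem.
Let $h = g \circ f$. Then observe that

$$\partial_i^2 h(x) = g'(f(x))\partial_i^2 f(x) + g''(f(x))(\partial_i f(x))^2,$$

$$\partial_i^3 h(x) = g'(f(x))\partial_i^3 f(x) + 3g''(f(x))\partial_i f(x)\partial_i^2 f(x) + g'''(f(x))(\partial_i f(x))^3.$$

It follows that for any $i$ and $x$, $|\partial_i^2 h(x)| \le C_1\lambda_2(f)$ and $|\partial_i^3 h(x)| \le 6C_2\lambda_3(f)$, where $C_1 =
\|g'\|_\infty + \|g''\|_\infty$ and $C_2 = \frac{1}{6}\|g'\|_\infty + \frac{1}{2}\|g''\|_\infty + \frac{1}{6}\|g'''\|_\infty$.

Next, for $0 \le i \le n$, define $Z_i := (X_1, \ldots, X_{i-1}, X_i, Y_{i+1}, \ldots, Y_n)$ and $W_i := (X_1, \ldots, X_{i-1}, 0, Y_{i+1}, \ldots, Y_n)$,
with obvious meanings for $i = 0$ and $n$. For $1 \le i \le n$, define

$$R_i := h(Z_i) - h(W_i) - X_i\partial_i h(W_i) - \tfrac{1}{2}X_i^2\partial_i^2 h(W_i),$$

$$T_i := h(Z_{i-1}) - h(W_i) - Y_i\partial_i h(W_i) - \tfrac{1}{2}Y_i^2\partial_i^2 h(W_i).$$

By third order Taylor expansion and the bounds on the third partials of $h$ obtained above, we
immediately see that $|R_i| \le C_2\lambda_3(f)|X_i|^3$ and $|T_i| \le C_2\lambda_3(f)|Y_i|^3$. Second order bounds, on the
other hand, imply that $|R_i| \le C_1\lambda_2(f)|X_i|^2$ and $|T_i| \le C_1\lambda_2(f)|Y_i|^2$. Now for each $i$, $X_i$, $Y_i$ and
$W_i$ are independent. Hence

$$\mathbb{E}(X_i\partial_i h(W_i)) - \mathbb{E}(Y_i\partial_i h(W_i)) = \mathbb{E}(X_i - Y_i)\,\mathbb{E}(\partial_i h(W_i)) = 0.$$

Similarly, $\mathbb{E}(X_i^2\partial_i^2 h(W_i)) - \mathbb{E}(Y_i^2\partial_i^2 h(W_i)) = 0$. Combining all these observations (and noting that
the terms $h(W_i)$ cancel) we have, for any $K > 0$,

$$\begin{aligned}
|\mathbb{E}g(U) - \mathbb{E}g(V)| &= \Big|\sum_{i=1}^n \mathbb{E}(h(Z_i) - h(Z_{i-1}))\Big| \\
&= \Big|\sum_{i=1}^n \mathbb{E}\big(X_i\partial_i h(W_i) + \tfrac{1}{2}X_i^2\partial_i^2 h(W_i) + R_i\big) - \sum_{i=1}^n \mathbb{E}\big(Y_i\partial_i h(W_i) + \tfrac{1}{2}Y_i^2\partial_i^2 h(W_i) + T_i\big)\Big| \\
&\le C_1\lambda_2(f)\sum_{i=1}^n \big[\mathbb{E}(X_i^2; |X_i| > K) + \mathbb{E}(Y_i^2; |Y_i| > K)\big] \\
&\quad + C_2\lambda_3(f)\sum_{i=1}^n \big[\mathbb{E}(|X_i|^3; |X_i| \le K) + \mathbb{E}(|Y_i|^3; |Y_i| \le K)\big].
\end{aligned}$$

The corollary follows by taking $K \to \infty$. □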

Proof of Theorem 1.3. We begin by defining a bunch of functions. The domains will be
clear from the definitions. Let

$$\begin{aligned}
u(x, f) &:= e^{\alpha f(x)}, \\
Z(x) &:= \sum_{f \in \mathcal{F}} u(x, f), \\
p(x, f) &:= Z(x)^{-1}u(x, f), \\
a_i(x, f) &:= \alpha\,\partial_i f(x), \\
e_i(x) &:= \sum_{f \in \mathcal{F}} a_i(x, f)p(x, f).
\end{aligned}$$

Note that for any $x$, $p(x, \cdot)$ is a probability on $\mathcal{F}$. This will be widely used without mention in
obtaining the bounds below. Also, note that $F(x) = \alpha^{-1}\log Z(x)$.

We shall now find bounds on the partial derivatives of several orders for these functions. Function
arguments will be suppressed for clarity. First, note that clearly from the given expressions,

$$\partial_i u = a_i u, \tag{9}$$

$$\partial_i Z = \sum \partial_i u = Z\sum a_i p = Ze_i. \tag{10}$$

Using (9) and (10) and the expression for $p$ we get

$$\partial_i p = \frac{Za_i u - Ze_i u}{Z^2} = (a_i - e_i)p. \tag{11}$$

Now, directly from the expression for $e_i$, we get

$$\partial_i e_i = \sum (p\,\partial_i a_i + a_i\,\partial_i p), \tag{12}$$

$$\partial_i^2 e_i = \sum \big(p\,\partial_i^2 a_i + 2(\partial_i a_i)(\partial_i p) + a_i\,\partial_i^2 p\big). \tag{13}$$

Differentiating (11) and using (11) again for $\partial_i p$, we get

$$\partial_i^2 p = (\partial_i a_i - \partial_i e_i)p + (a_i - e_i)^2 p. \tag{14}$$

Now for $1 \le r \le 3$, let $C_r = \sup\{|\partial_i^r f(x)| : 1 \le i \le n, f \in \mathcal{F}, x \in I^n\}$. Then note that for any $i$ we
have the uniform bounds

$$|a_i| \le \alpha C_1, \quad |\partial_i a_i| \le \alpha C_2, \quad |\partial_i^2 a_i| \le \alpha C_3. \tag{15}$$

In the following, we shall freely use the assumption that $\alpha \ge 1$. The first inequality above immediately
gives

$$|e_i| \le \alpha C_1. \tag{16}$$

From (11), (15) and (16), we get

$$|\partial_i p| \le 2\alpha C_1 p. \tag{17}$$

Using (12), (15) and (17) we get

$$|\partial_i e_i| \le \alpha^2(C_2 + 2C_1^2). \tag{18}$$

Using (14), (15), (16) and (18) we get

$$|\partial_i^2 p| \le \alpha^2(2C_2 + 6C_1^2)p. \tag{19}$$

Using (13), (15), (17) and (19) we have

$$|\partial_i^2 e_i| \le \alpha^3(C_3 + 6C_1C_2 + 6C_1^3). \tag{20}$$

The proof is completed by observing that $\partial_i F = \alpha^{-1}\partial_i \log Z = \alpha^{-1}e_i$ and using the bounds (16),
(18) and (20) in Definition 1.1. □

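Proof of Theorem 1.4. For each $\alpha \ge 1$, let $F_\alpha(x) = \alpha^{-1}\log\big[\sum_{f \in \mathcal{F}} e^{\alpha f(x)}\big]$. Also, let $F(x) =
\max_{f \in \mathcal{F}} f(x)$. Then we have

$$\begin{aligned}
F(x) &= \alpha^{-1}\log\big[e^{\alpha\max_{f \in \mathcal{F}} f(x)}\big] \\
&\le \alpha^{-1}\log\Big[\sum_{f \in \mathcal{F}} e^{\alpha f(x)}\Big] \\
&\le \alpha^{-1}\log\big[|\mathcal{F}|\,e^{\alpha\max_{f \in \mathcal{F}} f(x)}\big],
\end{aligned}$$

which gives the uniform bound

$$|F(x) - F_\alpha(x)| \le \alpha^{-1}\log|\mathcal{F}|.$$

Thus, by Theorems 1.1 and 1.3 (applied to $F_\alpha$), for any $K > 0$,

$$\begin{aligned}
|\mathbb{E}g(F(X)) - \mathbb{E}g(F(Y))| \le{}& 2\|g'\|_\infty\alpha^{-1}\log|\mathcal{F}| + 3\alpha C_1(g)\lambda_2(\mathcal{F})T_1(K) \\
&+ 13\alpha^2 C_2(g)\lambda_3(\mathcal{F})T_2(K),
\end{aligned}$$

where $T_1(K) = \sum_{i=1}^n [\mathbb{E}(X_i^2; |X_i| > K) + \mathbb{E}(Y_i^2; |Y_i| > K)]$ and $T_2(K) = \sum_{i=1}^n [\mathbb{E}(|X_i|^3; |X_i| \le
K) + \mathbb{E}(|Y_i|^3; |Y_i| \le K)]$. If $\gamma < \infty$, then we can let $K \to \infty$ and get

$$|\mathbb{E}g(F(X)) - \mathbb{E}g(F(Y))| \le 2\|g'\|_\infty\alpha^{-1}\log|\mathcal{F}| + 26\alpha^2 C_2(g)\lambda_3(\mathcal{F})\gamma n.$$

Now choose $\alpha = [(\gamma n\lambda_3(\mathcal{F}))^{-2/3}(\log|\mathcal{F}|)^{2/3} + 1]^{1/2}$. Note that $\alpha \ge 1$ and $\alpha^{-1} \le (\gamma n\lambda_3(\mathcal{F}))^{1/3}(\log|\mathcal{F}|)^{-1/3}$.
The Corollary follows from this. □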
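The sandwich $\max_f f(x) \le \alpha^{-1}\log\sum_f e^{\alpha f(x)} \le \max_f f(x) + \alpha^{-1}\log|\mathcal{F}|$ used in the proof above is easy to observe numerically. The following sketch treats the values $f(x)$, $f \in \mathcal{F}$, at a fixed point $x$ as an arbitrary array of numbers; the function name, the sample of values and the choices of $\alpha$ are assumptions made for illustration.

```python
import numpy as np

def smooth_max(values, alpha):
    """F_alpha = alpha^{-1} log sum_k exp(alpha * values[k]), computed stably."""
    v = np.asarray(values, dtype=float)
    m = v.max()
    # Subtracting the maximum before exponentiating avoids overflow.
    return m + np.log(np.exp(alpha * (v - m)).sum()) / alpha

rng = np.random.default_rng(4)
vals = rng.standard_normal(1000)      # stand-in for {f(x) : f in F}, |F| = 1000
for alpha in (1.0, 10.0, 100.0):
    gap = smooth_max(vals, alpha) - vals.max()
    print(f"alpha={alpha:6.1f}  gap={gap:.4f}  log|F|/alpha={np.log(len(vals)) / alpha:.4f}")
```

Larger $\alpha$ tightens the approximation of the maximum but inflates the factors $3\alpha$ and $13\alpha^2$ in the derivative bounds, which is the trade-off resolved by the choice of $\alpha$ in the proof.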

Acknowledgement. The author thanks Persi Diaconis for helpful comments and encouragement,
and Erwin Bolthausen for communicating the work of Carmona and Hu.